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Abstract 


The  design  equations  for  both  single -channel  and  multi¬ 
channel  optimum  least-squares  ("Wiener")  filters  are  derived 
and  discussed.  Specific  examples  of  such  filters  are  presented; 
for  example,  inverse  filters,  s ignal/r.oise  ratio  enhancement 
filters,  prediction  filters,  and  maximum- likelihood  filters. 

The  single- channel  and  multichannel  Levinson  recursion 
algorithms  for  solving  the  design  equations  are  discussed. 
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Introduction 


Levinson  (1947)  published  an  algorithm  for  recursively  extending  the  length 
of  a  digital  filter,  optimum  in  the  least  mean-square-error  sense,  designed  to 
change  a  given  waveform  into  another  desired  waveform.  The  algorithm  was  extended 
to  the  multichannel  input  case  by  Robinson,  and  to  the  multi-dimensional  case 
by  Wiggins  (Simpson  et  al.,  1963) . 

The  general  wave-shaping  problem  Includes  as  special  cases  the  design  of 
inverse  filters,  prediction,  interpolation,  and  smoothing  filters,  and  the 
algorithm  is  required  in  the  design  of  maximum-likelihood  filters  of  various 
types  (Kelly  and  Levin,  1964;  Simpson  et  al . ,  1963).  Good  discussions  of 
optimum  filter  design  have  been  presented  by  Claerbout  (1963)  and  Treitel  and 
Robinson  (1966). 

There  are  two  advantages  to  using  the  recursion  algorithm  Instead  of 
designing  the  full-length  filter  directly:  first,  in  the  computation  of  a 
alngle-channe 1  filter  of  length  L  points,  machine  storage  requirements  are 
reduced  from  a  multiple  of  L2  to  a  multiple  of  L  words,  and  the  number  of 
arithmetical  operations  is  reduced  from  a  multiple  of  L3  to  a  multiple  of  L2. 

For  the  multichannel  filter  with  N  inputs,  the  storage  reduction  is  from 
N2L2  to  N2L  and  the  operational  reduction  is  from  N3L3  to  N2L2.  It  was  not 
generally  practical  to  design  multichannel  filters  before  the  extension  of 
the  single-channel  algorithm. 
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The  second  advantage  is  that  the  mean-square  error,  the  quantity  one  is 
minimizing ,  can  easiJy  be  calculated  at  each  step  of  the  recursion.  Since  the 
error  falls  rapidly  at  first  as  the  filter  is  extended  and  then  tends  to  level 
out,  one  can  stop  the  process  when  the  error  falls  to  decrease  significantly. 
This  saves  computation  time  not  only  during  the  filter  design  but  also  later 

when  the  filter  is  being  applied  to  data. 

Previous  statements  of  the  recursion  algorithm  (Levinson,  1947;  Wiggins, 
1965;  Wiggins  and  Robinson,  1965;  Robinson,  1963;  Treitel  and  Robinson,  1966), 
although  adequate  for  programming  purposes  and  as  mathematical  proofs  that  the 
recursion  gives  correct  answers,  do  not  seem  to  me  to  offer  the  reader  very 
much  insight  into  why  the  algorithm  works.  In  addition,  some  of  the  published 
discussions  of  the  multichannel  algorithm  contain  errors  which  are  misleading 
for  the  inexperienced  reader. 

The  purpose  of  this  report  is  to  present  a  simple,  easily  understood 
development  and  discussion  of  both  the  single-channel  and  the  multichannel 
recursion  algorithms.  Program  writeups  and  listings  are  appended. 

In  order  for  this  work  to  be  self-contained,  we  begin  with  derivation 
and  discussion  of  the  digital  filter  design  equations  the  recursion  is  intended 
to  solve. 


Single-channel  wave-shaping  normal  equations 
Given  a  digital  time  series  containing  T+l  points  y0,  ylf...,  y^,  we 
require  a  filter  of  length  L+l  points  which  does  the  best  job  in  the  least 
mean-square -error  sense  of  converting  the  inpur  data  series  y  (we  are 
entitled  to  regard  the  points  y0,  y,,...,  yT  as  the  elements  of  a  vector  y) 
into  some  other  desired  data  series  (I  with  known  elements  dg,  dj,..., 
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We  denote  the  filter  by  £  and  the  output  of  the  filter  operating  on  the  Input 

4  -4 

data  y  by  z: 


L 

l 

J-o 


Vk-j 


k  -  0,  1,...,  T+L+l  (1) 


Here  and  later  we  adopt  the  convention  that  all  vectors  are  in  a  space  of,  say, 
J  dimensions,  where  J  >>  T+L,  and  that  elements  of  the  vectors  are  zero 
outside  the  range  of  explicit  definition  here. 

In  matrix  notation,  (1)  is 


The  design  criterion  is  minimization  of  the  square  of  the  length  of  the  error 
vector 


e 


z  -  d 


(3) 
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ve  wish  to  minimiza 


E  -  I®  I 


T+L+l 

l 

t-0 


l  yt-j£j  " 

j-0  J  3 


The  minimum  value  of  E  is  attained  if  all  the  partial  derivatives  of  E  with 
respect  to  the  f.  ,  k  -  0,  !*•••*  L,  are  zero. 


T+L+l 

l 

t-0 


l  Vt-J  - 

j-o  -1  ‘  1 


T+L+l 

l 

t-0 


'«-*  [  jo  Vt-J  - 


-  0  k  -  0,1 . L  (5) 


[The  quantity  in  brackets  is  the  error  vector,  so  (5)  says  that  the  error  vector 
e  is  orthogonal  or  normal  to  y;  hence  the  name,  normal  equations.]  We  can 


write  this  set  of  equations 


L  T+L+l 

i  fj  K 

j-o  J  t-0 


yt-kVj 


T+L+l 

l  yt_kdt  k"0’1 . L 

t-o  c 


In  matrix  form  this  is 


T  -*■  T-*- 

Y  Yf  -  Id 


(7) 
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thnt  is 


(7) 


We  can  simplify  these  equations  by  Introducing  the  transient  autoconrelation 


function 


r 


M-s 


l 

t-0 


Vt+s 


C9) 


where  M  Is  the  range  of  definition  of  y.  Ve  can  define  a  vector  r  whose 
elements  are  given  by  (9).  With  the  change  of  variables  q  ■  t-k,  the  quantity 
In  brackets  on  the  left  side  of  (6)  Is 
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T+L+l 

l  Wt-j 


T4  L+l-k 

l 

q-0 


Vq+k-J 


The  bottom  limit  on  the  second  sum  can  be  taken  as  zero,  since  yq  “  0  for 

q  <  0.  Introducing  the  transient  crosscorrelation  function 


>„<■>  *  *  tl0  v 


the  right  side  of  (6)  can  be  written 


T+L+l 

l  yt-kdt 


T+L+l-q 

1  yqdt+k 

q»0 


k  ■  0,  1, . . . ,  L 


so  that  (6)  Is 


k*0,  1,  •  •  •  t  1* 


It  is  easy  to 


show  that  these  definitions  are  consistent  with  the  matrix 


notation  introduced  above.  We  can  define 


r0  rl  r2  •**  rL 
rl  r0  r\  rL-l 

R  -  TTY  -  r2  rl  *0  rL-2 


rL-l  rL-2  **• 
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so  that  (8)  and  (13)  can  be  written 

-*■  ■> 

Rf  -  g 


(15) 


Minimum  error  energy : 

The  minimum  value  attained  by  the  error  energy  ie  obtained  by  substituting 
the  solution  of  (15)  into  (4):  if  f  satisfies  Rf  -  g,  that  is,  YTY?  -  Y^d, 
and  if  the  output  is  z  ■  Yf,  then 


E 


'  1 

e  e 


(d  -  z)  (d  -  z)  - 


+T+ 

d  d 


-►T-v 

Zz  d  +  / ‘ 7 


3*3 +  H  - 
3T3  +  ;Ti  - 


-►Tt  .  -*T-* 
d  d  +  z  z 

jTj  ,  -*T+ 
d  d  +  z  z 


2fTYTd 

zn 

zf*f 

2(y1)ty1 


so 


E 


min 


-»T-y 

d  d  - 


+T-y 

z  z 


(16) 


(17) 


so  that  the  energy  in  the  error  vector  is  simply  the  energy  in  the  desired 
output  minus  the  energy  in  the  actual  output. 

The  error  energy  can  be  calculated  without  actually  constructing  the 
output  z:  in  (16)  we  can  convert  z^z  into  z^d  instead  of  the  other  way 
around,  and  since 

VT-t  +Tvrt  -T+ 
zd-fYd  -  fg 


we  have 


lin 


d  d  -  f  g 


(18) 
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E,  being  a  sum  of  squares,  cannot  be  negative;  nor  can  It  be  greater  than 
by  (17),  so  If  we  divide  through  (18)  by  (Pd  and  define 

P  -  Z/d*d  -1-^=5  (19) 

d  d 


we  are  assured  that 


0  4  P  <  1 


ao  that  P  can  be  used  as  a  measure  of  the  performance  of  a  filter  f  satisfying 
Rf  -  g;  in  particular,  as  the  length  of  f  is  extended,  the  performance  can  be 
evaluated  at  each  step  in  the  recursion  (Treitel  and  Robinson,  1966). 


The  Inverse  filter; 

Suppose  the  desired  output  dfc  is  specified  to  be  unity  at  time  t  -  s  and 
cero  everywhere  else.  If  it  were  possible  to  achieve  this  desired  output 
perfectly,  we  would  have 


l  Vt-i 


f*y  ■  (0,0, . .  • , 0, 1 ,0 , . . .  ,0) 


(20) 


and  such  a  filter  is  called  an  inverse  filter  by  virtue  of  the  resemblance 
between  (20)  and  the  definition  of  an  Inverse: 


a  ®  b  -  I 

where  I  is  the  identity  with  respect  to  the  operator  Q  . 

The  right  side  of  the  normal  equations  (15)  for  the  inverse  filter  is 
Just  the  (s+1) *  St  column  of  Y  .  The  right  side  of  equation  (8)  shows  that 
for  L  <  a  <  T+l,  g  is  an  (L+l)-term  segment  of  the  input  yfc,  in  reverse 
order:  (y  ,  y  .,..., y  ,).  For  0  <  s  <  L  or  T+l  <  s  <  T+l+1,  g  has 
fewer  than  L+l  nonzero  terms.  We  see  that  for  a  »  0  and  a  -  T+L+1,  g  has  only 
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one  nonzero  element.  The  former  case  is  the  zero-delay  inverse  filter,  which 
snt  tsf les 


y0 

0 


Rf 


0 


(21) 


0 


The  value  of  yQ  enters  these  equations  only  as  a  scale  factor,  which  does  not 

affect  the  shape  of  the  filter  or  of  its  output. 

The  zero-lag  inverse  filter  thus  depends  not  at  all  on  the  actual  waveform 

y  (except  trivially  as  a  scale  factor),  but  only  on  its  autocorrelation  r. 

-*■ 

Since  the  autocorrelation  is  symmetric  —  r^  ■  r_^  —  f  does  not  depend  on 

the  phase  spectrum  of  the  input  data.  We  might  expec:  that  the  phase  response 
•¥ 

of  the  filter  f  depends  only  upon  some  intrinsic  property  of  the  autocorrelation, 

and  it  does:  the  phase  spectrum  of  f  is  such  that  the  total  energy  in  the 

-*• 

waveform  f  i  Jammed  up  as  much  toward  the  front  of  the  waveform  (toward  ffl) 

-*■ 

as  is  consistent  with  f  satisfying  (21).  This  concept  of  minimum  energy 
delay  or  minimum  phase  is  the  subject  of  a  considerable  literature  (Robinson, 

1954;  Robinson  and  Treitel,  1965;  Robinson,  1962). 

The  performance  factor  P  of  an  inverse  filter  depends  drastically  on  the 
delay  s  at  which  the  desired  output  is  to  occur.  (Treitel  and  Robinson,  1966, 
show  examples  for  which  the  performance  factor  is  0.005  for  s-0,  but  0.860  for 
s  ~  15).  A  recursion  exists  (Simpson  et  al . ,  1963)  which  shifts  the  desired 
output  lag  a.  This  can  be  used  to  search  relatively  inexpensively  for  the 
lag  at  which  the  performance  factor  can  be  maximized  for  given  filter  length. 
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The  prediction  filter: 

Suppose  we  request  a  desired  output  at  time  t  consisting  of  the  input  itself 
p  time  units  ahead.  This  is  an  extrapolation  problem  of  considerable  practical 
interest,  for  example,  in  economics  and  weather  forecasting. 

The  desired  output  is 


Substituting  this  into  (11)  we  have  for  the  right-hand  side  vector  of  the 
normal  equations 


T 

^s^s+k 

b-0 


k-0 ,  1 , . . . ,  1. 


n  ****+?  "  rn 

8-0  r  p 


so  (13)  becomes 


I 

• 

• 

• 

CM 

M 

O 

u 

1 

fo" 

r 

P 

rl  r0  rl  *•*  rL-l 

fl 

rp+i 

r2  rl  r0  rL-2 

f2 

rp+2 

•  •  • 

•  •  • 

rL  rL-l  rL-2  r0 

fL_ 

rp+L 

(22) 


The  presence  of  a  segment  of  the  r  vector  on  the  right-hand  side  of  (22) 
Is  interesting.  A  trivial  prediction  filter  is  that  for  p-0;  in  this  case  we 
obviously  have  t  -  (1,0,..., 0),  i.e.,  t  reproduces  the  input,  as  it  was 
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told  to  do.  The  next  simplest  prediction  filter  attempts  to  predict  only  one 
unit  ahead  In  time: 


- 

- 

“  m 

r0  rl  r2  • • •  rL 

ffo 

ri 

rl  r0  rl  *•*  rL-l 

*1 

*2 

r2  ro  ...  rL_2 

*2 

•  •  • 

m 

r  3 

•  •  • 

rL  rL-l  rL-2“*  r° 

£l 

rL+l 

. 

- 

—  — 

(23) 


The  desired  output  is  dt  -  yfc+1.  The  actual  output  la 


l  Vt-k 

k-0 


(24) 


and  the  error  is 


d.  -  z. 


t+1 


k-0 


fkyt-k 


(25) 


The  form  of  the  right-hand  side  of  (25)  suggests  that  we  could  define  a 
prediction  error  filter  to  get  e^  directly,  as  follows: 


h 


0 


l; 


hj "  ~V  J " 1,2 . L+1 


The  prediction  error  filter  is  thus  one  unit  longer  than  the  prediction 
filter,  and  its  output  is 


e 


t 


L+1 

l 


k-0 


Vt-k 


(26) 
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where  t+i  in  (25)  has  been  replaced  by  t  in  (26). 

We  can  inquire  what  normal  equation8  the  prediction  error  filter  h 

satisfies:  we  write 


where  the  ,  J-0,  L+l,  are  as  yet  unknown. 
Let  us  use  the  notation 


Then  (15)  is 

Rf  -  g 

and  (27)  can  be  written 
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But  hj  -  -  f  by  definition.  So  we  have 


Multiplying  this  out, 

-*■  T  "t 

r0  -  8l  f  "  c0 

*!  “  M  "  C1 


(30) 


the  last  step  following  from  (23).  From  the  previous  discussion  on  error 
energ”,  we  know  that  g*  f  is  the  energy  in  the  predicted  output  (equation  18). 
The  zero-lag  autocorrelation  is  the  energy  in  the  input,  so  we  have 


"Tl+  -M-*-  _  - 

-r0-zz"yy-zz"E 


where  E  is  the  unpredictable  energy.  Thus  the  prediction  error  filter  satisfies 


Except  for  the  scale  factor  E,  (31)  is  the  same  as  the  normal  equations  for 
the  inverse  filter  (equation  21).  An  heuristic  argument  as  to  why  this 
should  be  so  was  given  by  Claerbout  (1963) ;  a  rigorous  derivation  was  given 
by  Robinson  (1954) . 
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The  prediction  error  filter  for  spans  greater  than  1  is  not  so  simple. 
For  example,  for  p  -  2,  we  have 


- 

-  H 

-  “ 

r0  ti  r2  ...  rL+2 

1 

c 

1 

r0  rl  rL+l 

0 

cl 

r2  r,  rQ  ...  rL 

-f0 

c2 

•  •  • 

•  •  • 

rL+2  rL+l  rL  •"  r° 

“fL 

CL+2 

- 

T 

c2  “  (c2,  c3»****cl+2^  ’ 

82 

-  (r2,r 

3  •  •  •  • 

.T 

rL+2;  ’ 

(32)  becomes 


•  * 

•  * 

1  |  -*■  T 

r0  1  *2 

1 

Jo_ 

|  1  “t 

+  I  _ 

0 

_ 

m 

_ci_ 

«2  ,  *1  1  * 

'  m 

-f 

C2 

(32) 


(33) 


Multiplying  this  out,  we  have 

*0  -  *2Tf  "  c0 
-►  T+ 

r l  -  8l  f  "  C1 


g2  -  Rf  -  c2  -  0 


(34) 

(35) 

(36) 


and  all  this  can  be  calculated  fron  f  and  r.  It  is  easier  to  construct  the 
predicted  trace  and  subtract  it  from  the  input  with  the  appropriate  shift. 
Alternatively,  one  can  construct  an  approximation  to  the  output  of  the 
prediction  error  filter  for  span  two  by  filtering  twice  with  a  unit-span 
prediction  error  filter. 
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Gencrnl  characteristics  of  the  wave-shaping  filter: 

Returning  to  equations  (14)  and  (15),  we  notice  that  the  filter  does 
not  actually  depend  on  the  waveform  y  but  only  on  its  autocorrelation 
function  and  crosscorrelation  with  the  desired  output.  This  fact  is  of 
central  importance  in  the  analysis  of  stationary  stochastic  processes  where 
the  waveform  Is  a  collection  of  random  variables,  whose  correlation  properties 
nevertheless  do  not  change  with  time;  hence  the  correlation  functions  can  be 
measured  using  one  sample  of  the  noise,  and  the  resulting  filter  will  be 
valid  for  other  samples  of  the  noise. 

Notice  also  that  the  optimum  filters  of  different  length,  say  L  and 
L+l,  do  not  contain  the  same  elements  up  through  L  terms;  for  example,  the 
one-point  filter  is 

fo  •  go/*o 

and  the  two-point  filter  is 


fo 

1 

ro80  -  rigi  ; 

*1 

fM 

U 

CM 

O 

U 

ro8l  "  rl8o 

The  Levinson  recursion  allows  us  to  construct  the  filter  of  length 
L+l  from  the  filter  of  length  Lf  Since  we  know  the  one-point  filter 
ffl  •  g0/r0,  we  can  recursively  construct  a  filter  of  any  length. 

Notice  that  if  =  0  (zero  delay  inverse),  f  is  minimum  delay. 


since  rQ  is  greater  than  r^. 
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Thc  single-channel  recursion  algorithm 

The  single-channel  normal  equations  (15)  were: 


I- 

•  ■ 

r0 

rl 

r  2  •  •  • 

rL 

f0 

8o 

rl 

ro 

rt 

rL-l 

fl 

*i 

r2 

rl 

r0 

rL-2 

f2 

g2 

rL 

rL-l 

rL-2  ’ *  * 

r0 

fL 

gL 

_ 

J 

L  J 

(15) 


R  Is  an  L+l  by  L+l  matrix;  to  store  it  requires  (L+l)2  words,  and  to  find 
1  -  R_1g  by  inverting  R  would  require  (L+l)3  arithmetical  operations.  But 
R  is  a  Toeplitz  matrix,  i.e.,  all  the  elements  on  a  given  diagonal  are  equal,  so 
that  'there  are  really  only  L+l  different  elements  in  R.  The  Levinson  recursion 
cleverly  capitalizes  on  this  fact  to  calculate  f  directly  from  the  two 

vectors  (r0,  rx .  rL)  and  (g0,  gj,...,gL)  without  ever  storing  the  whole 

matrix  R.  The  saving  in  arithmetical  operations  comes  from  the  fact  that 
extending  the  one-point  filter  to  length  L+l  requires  only  a  total  number  of 
arithmetical  operations  proportional  to  (L+l)2. 

The  operation  of  the  algorithm  is  most  easily  understood  from  a  concrete 
example.  We  will  take  L  -  2,  and  illustrate  the  extension  from  length  3  to 
length  4.  Using  primes  to  denote  the  unknown  elements  of  the  new  filter,  we 
seek  the  solution  f'  of: 
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f 

• 

r0  rl  r2  r3 

n 

*o 

rl  r0  rl  r2 

fi 

m 

*i 

(37) 

r2  rl  r0  rl 

f2 

*2 

r3  r2  rl  r0 

f  3 

*3 

m  m 

m  m 

•  m 

vhere  we  already  know  the  solution  £  of. 


(3e) 


We  aee  the  3-by-3  R  matrix  of  (38)  in  the  upper  left  corner  of  the  4-by-4 
matrix  of  (37),  and  the  $  of  (38)  in  the  upper  part  of  the  right-hand  aide 
of  (37).  This  suggests  trying  to  express  I'  In  terms  of  f  arid  some  other 
unknown  vectors 


The  reason  for  splitting  off  the  unknown  scalar  y  and  making  the  Indices  on 
■* 

l»*  go  backward  will  appear  later. 
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We  can  also  split  g  into  a  part  which  depends  on  y  and  a  part  which 
does  not: 


_  1 

■  " 

M  * 

BO 

80 

0 

8  1 

0 

- 

«2 

«2 

0 

83 

rf 

v' 

•  • 

. 

Y 


MO) 


Now  it,  v\  and  r  are  unknown.  "rf"  la  a  Fortran-Uke  ajmbol  for  "r  dot  f 

as  will  be  seen  later. 

From  (40) ,  we  have 


v'y  +  rf  -  g3 


(41) 


Substituting  (39)  and  (40)  into  (37),  we  have 


’  ro 

x\ 

X2 

r3 

£0 

"bf 

V 

■o' 

ri 

xo 

xl 

r2 

fl 

+ 

b 2 

Y 

m 

81 

+ 

0 

*2 

xi 

r0 

rl 

*2 

bi 

82 

0 

r3 

m 

x2 

*1 

r0 

« 

1 

/o 

bo 

■ 

rf 

m  * 

v’ 

•  a 

(42) 


can  separate  (42)  into  a  set  of  equations  which  depend  on  y 


r0 

rl 

r2 

k3 

’bf 

’  0  * 

rl 

r0 

rl 

r2 

b2 

■ 

0 

*2 

rl 

r0 

rl 

bl 

0 

*3 

0 

x2 

rl 

r0 

« 

b0 

■  « 

v' 

•  m 

(43) 
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and  a  part  which  is  independent  of  y: 


— 

m  - 

ro 

rl 

C2 

r3 

f0 

®o 

rl 

ro 

rl 

r2 

fl 

m 

8l 

r2 

rl 

ro 

rl 

f2 

g2 

r3 

f2 

rl 

ro 

0 

rf 

(44) 


The  top  three  equations  of  (',«)  are  just  (38)  again.  The  bottom  equation  is 


rf  - 


£  rL-k+lfk 

k"0 


(45) 


Using  (41),  we  have 

Y  -  (g ^  ”  rf ) /v ’ 


(46) 


Now  all  we  need  to  extend  f  to  f'  are  v*  and  b' .  But  the  remarkable 
symmetry  of  R  allows  to  turn  equation  (43)  upside  down  and  inside  out,  and 
identify  it  with  (31),  the  equation  defining  an  optimum  unit-span  prediction 
error  filter  -  provided  we  have  b0-l,  which  we  will  show  we  can  always  do. 

Then  v’  is  just  the  error  energy  (the  symbol  v  having  been  chosen  to  connote 
variance).  The  fact  that  (43)  is  backward  with  respect  to  (31)  is  unimportant 
in  the  single-channel  situation.  For  the  multichannel  filters,  we  shall  see 
that  predicting  backward  In  time  is  fundamentally  different  from  predicting 
forward,  the  reason  being  that  the  crosscorrelation  functions  are  not  symmetric. 

Thus  far  we  have  seen  that  in  order  to  extend  the  wave-shaping  filter  f, 
all  we  need  is  a  method  for  extending  the  backward  prediction  error  filter  b. 
We  can  anticipate  that  thi9  will  be  a  simpler  task,  since  (31)  is  so  much 
simpler  than  (8) . 
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At  each  step  of  the  recursion  we  extend  b  to  b',  calculate  v  from 
(18),  and  then  extend  f  to  £'  using  (46),  (45),  and  (39).  Thus  we  turn 
our  attention  to  the  problem  of  extending  b. 

Suppose  we  know  b  and  v  at  this  step  of  the  recursion: 


(47) 


* 

• 

“  — 

rn  rl  r2 

b2 

0 

rl  r0  rl 

bl 

m 

0 

r2  rl  rQ 

- 1 

O 

_ 1 

V 

Filling  out  S  with  a  zero  to  length  4,  substituting  it  into  (43),  and  using 
(47),  we  have: 


*0 

*1 

*2 

■ 

r  3 

0 

e 

rl 

*0 

rl 

r2 

b2 

m 

0 

*2 

rl 

r0 

rl 

bl 

0 

l'3 

r2 

*1 

*0 

b0j 

Lv. 

where  e 

is 

yet 

unknown . 

We  can  rearrange 

(48) 

as  f< 

symmetry 

of  R: 

m  m 

r 

*0 

*1 

*2 

r  3 

bo 

V 

*1 

*0 

ri 

*2 

bl 

m 

0 

*2 

ri 

ro 

rl 

b2 

0 

r3 

r2 

rl 

r0 

0 

e 

Now 

suppose 

we  mu 

ltiply  both  sides  of 

(49) 

by  a 

(48) 


(49) 


add  the  result  to  (48),  and  attempt  to  identify  the  result  with  (43): 
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The  bottom  equation  of  the  a-dependent  part  of  (50)  determines  e: 


rL-k+lbk 


e 


(51) 


Comparison  of  the  top  element  of  the  right-hand  side  of  (50)  with  the  top 
element  of  the  right-hand  side  of  (43)  determines  o: 
e  +  va  -  0 


(52) 


Similar  comparison  of  the  bottom  elements  determines  v': 
v'  -  v  +  ea  -  v  -  ez / v 


(53) 


end  so  finally: 

b*  ■  b.  +  ab,  ,  k  "  (54) 

k  k  L-k 
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Equation  (54)  is  rather  remarkable.  Notice  that  it  implies  that  is 

-*■ 

never  changed  during  the  recursion  process,  so  b  always  remains  a  unit— span 
prediction  error  operator,  provided  that  we  start  with  bg  ■  1. 

We  now  have  everything  necessary  to  extend  1. 

Summary  of  single-channel  recursion: 

Starting  values:  bfl  -  1;  f0  -  g0/r0;  v  -  rQ.  Then  for  L  -  0,1,2,..., 
compute: 

1. 


L 

Jo 


rL-k+lbk 


2.  a  ■  -e/v 

3.  b'  -  b.  +  ab.  .  k  -  0,L+1 

k  k  L— It 

4.  v'  ■  v  +  oe 


5. 


rf  *Jo  'L-k+lfk 


6.  y  "  <8l+j  -  rf)/v' 

7*  fk  "  fk  +  YbL-k+l  k-0,L+l 


Notice  that  in  order  to  construct  a  zero-delay  inverse  filter  recursively, 
ve  need  carry  out  only  the  first  four  stepB. 
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Normal  egna tlons  for  the  multichannel  wave-shaping  filter 


Suppose  wo  have  N 
Zj(t),  J  B  1*M,  related 


Input  channels  y^(t), 
to  the  inputc  by: 


1 


1,  N,  and  M  output  channels 


zj<t) 


N 


I 

i-i 


L 


I 

s=0 


fu(s) 


y^t-s) 


J-l.M  (55) 


so  that  each  Input  contributes  to  each  output  via  the  N*M  filters  f^Cs). 
Suppose  also  that  we  have  a  given  set  of  desired  output  functions  dj(t), 
J-1,M.  Then  we  seek  the  set  of  filter  coefficients  which  minimizes  the  set 
of  M  error  functions 


E 


J 


T+L+l 


l 

t-0 


lz.(t)  -  dj  (t)  ] 2 


J  -  1*M 


(56) 


[We  see  frnm  (56)  that  since  the  M  outputs  are  decoupled  as  far  as  the 
design  criterion  is  concerned,  we  could  just  as  well  work  with  the  single 
output  channel  case,  and  superimpose  the  results.] 

Substituting  (55)  into  (56),  we  have 


T+L+l 

I 

t-0 


N 

l 

i-1 


L 

I 

s-0 


f ij (s)  y1(t-s) 


' 

-  dJ(t) 


J-l.M 


(57) 


The  Ej  are  all  minimized  when  the  partial  derivatives  with  respect  to  the 

filter  coefficients  all  vanish:  9E  /3f.  (u)  -  0  for  m  -  1,M;  k  -  l.N; 

m  Km 

u  ■  0,L.  Writing  this  out,  we  have: 
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3F, 


m 


s£^(u>  '  2 


T+L+l  N  L  1 

l  l  l  fi«(8,yi(t'8)  *  V0  I  yk(t‘u)  ■ 0 

t-o  L  i-1  s-0  J 


k  -  1,N;  m  -  1,M;  u  -  0,L 


(58) 


which  we  can  write: 


II  t  T4-T  4*1 

A  i  A, 


T+L+l 

l  yk(t-u)dm(t) 


t-c 


k  -  1,H;  m  -  1,M;  u  -  0,L 


(59) 


We  can  define  correlation  functions  as  in  the  single-channel  case: 


T+L+l 

'bn'"’  *  la  yh<t)yn(t+t|) 

t-0 


(60) 


®hn<<£> 


T+L+l 

l  yh(t)d  (t+q) 
t-o  n 


(61) 


From  the  cros3-syrametry  of  the  crosscorrelation  function,  we  have 


'ho^  •  W-’5 


(62) 


Thus  we  can  write  (59): 


L 


I 

s-0 


V> 


ik 


(8-U) 


8i«(u) 


k-l,N 

m-l,M 

u-0,L 


(63) 
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We 


can  use  (62)  to  make  this  set  of  equations  resemble  a  matrix  equation: 


l  l  rkl(U-9)f^<8)  -  g,_(u) 


i*l  sb0 


lm 


skm 


k«l,N 

m*l,M 

u«0,L 


(64) 


As  an  illustration,  the  set  of  equations  for  N»2,  L-2,  M-2,  is: 


*11 (°)  *12 (0) 

rn  (-D 

*12  (-D 

« 

*11 (-2)  *12 (-2) 

fll(0)  f  12  (0) 

8ll(0)  gi 2 (0) 

*?l  (0)  r22(0) 

r21 ("I) 

*22 ("1) 

*2 1 ("2)  r22 (-2) 

f2l(0)  f22(0) 

821 (0)  g22(0) 

r  1 1  (1)  *12(1) 

*11 (0) 

*12(0) 

*11 (-1)  *12  (-D 

fll(l)  f  12  (1) 

gll(D  gi2d) 

*2 1 (1)  *22 (1) 

*21 (0) 

*22  (0) 

*2 1 (~1)  *22 ("1) 

r2 id)  f22 (i) 

g2i(l)  822(1) 

*11 (2)  ri2 (2) 

*11 (1) 

*12(1) 

*11  (0)  r12(0) 

1 1 1 (2)  fl2 (2) 

8l l  (2)  812(2) 

*21 (2)  r22 (2) 

*21 (1) 

*22  (D 

- 1 

O 

CM 

CM 

u 

O 

CM 

u 

^21  (2)  f22(2) 

mi 

821  (2)  822(2) 

(65) 


The  dashed  partition  lines  we  have  drawn  in  (65)  suggest  a  matrix  formulation: 
define 


ril(s)  *12  (s)  ...  r^s) 


r2i(s)  *22  (®)  •••  r2N(s) 


rN1  (8)  rN2^8^  *•*  *u%j(S) 


L  N1 


NN 


(66) 


fll(8)  f  1  2 (®)  •••  f  (®) 

f  2 1  (®)  f22(g)  •••  f2M^ 


fNl(8)  fM7(s)  f„u(s) 


N2 


NM 


(67) 
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8l i  <s>  8i2<8>  “•  *1M(S) 

g21(s)  g22(s)  •••  82M(8) 

gjjlC*)  **•  8nM*8* 


(68) 


We  can  now  define  supermatrices  (matrices  whose  elements  are  themselves  matrices): 


R 


- 

e 

- 

« 

r0 

r-i  r_2 

•  *  *  r-L 

ro 

T 

rl 

T 

•  •  • 

T 

rL 

rl 

ro  r-i 

*  *  *  r-L+l 

m 

ri 

r0 

T 

•  •  • 

T 

Vi 

r2 

rl  r0 

*  *  *  r-L+2 

r2 

rl 

ro 

T 

rL-2 

rL 

rL-l  rL-2 

...  Tq 

rL 

rL- 

-lrL-2... 

ro 

m 

k 

(69) 


80 

F  - 

fi 

G  - 

8l 

h 

82 

fL 

*L 

>  * 

(70) 


From  (62)  we  see  that  R  is  symmetric  and  block-Toeplitz.  The  normal 
equations  (64)  become  simply: 


RF  -  G 


(71) 


-27- 


where  R  is  an  L-by-L  matrix  whose  elements  are  each  N-by-N  matrices; 

F  and  G  are  both  column  vectors  each  of  whose  elements  is  an  N-by-M  matrix. 

Recursion  for  the  Multichannel  filter 
The  Levlnson-Roblnson  recursion  for  the  multichannel  waveshaping  filter 
goes  as  above,  with  the  following  exceptions: 

1.  Elements  of  the  normal  equations  are  matrices,  not  scalars. 

2.  Since  crosscorrelations  are  not  symmetric,  superdiagonal  elements 

T 

of  R  must  be  written  r^  or  r_^,  rather  than  r^. 

3.  Since  matrices  do  not  necessarily  commute,  we  must  be  careful 
about  the  order  of  elements  in  a  product. 

4.  For  reciprocals  we  need  to  use  inverses. 

5.  Most  important,  the  backward  multichannel  prediction  error  operator 
is  not  simply  the  time  reverse  of  the  forward  prediction  error 
operator.  We  find  that  instead  of  one  auxiliary  sequence  b  ,  we 
will  need  two,  a  and  b,  which  correspond  to  the  forward  and  back¬ 
ward  prediction  error  operators.  Instead  of  the  single  variance 

and  error  terms  v  and  e,  we  will  need  a  variance  and  an  error 
term  for  both  the  forward  and  the  backward  prediction  error  operators. 
As  before,  we  illustrate  the  recursion  for  L  ■  2.  We  seek  the  solution 
f'  of: 
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• 

■“  * 

T  T  T 

f  t 

to  tj  r2  r^ 

f0 

80 

T  T 

C  t 

ri  r0  r!  r2 

fl 

Si 

m 

T 

r2  rt  r0 

n 

82 

r3  r2  rt  r0 

83 

L  J 

i  j 

_ 

where  we  know  the  solution  f  of; 


-  ■ 

»  * 

T  T 

r0  rl  r2 

f0 

80 

T 

rl  r0  rl 

fl 

- 

8l 

r2  rl  r0 

*2 

82 

■  0 

m  0 

(72) 


(73) 


and  where  now  the  r's,  f's,  and  g’s  are  given  by  (69)  and  (70) 

-*  -+■ 

As  before,  we  express  f'  in  terms  of  f  and  an  unknown  correction: 


- 

fo 

V 

m 

fi 

+ 

b2 

f2 

w 

*3 

0 

,  , 

m  m 

m  m 

(74) 


-r  ^ 

where  now  y  and  the  elements  of  f  and  f'  are  N-by-M  matrices,  and  the 
elements  of  b'  are  N-by-N  matrices. 

We  split  g  into  a  part  which  depends  on  y  and  a  part  which  does  not: 


"  - 

•  - 

g0 

8o 

0  . 

at 

61 

+ 

0 

82 

82 

0 

83 

rf 

vb’ 

. 

■ 

(75) 
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where  rf  is  an  N-by-M  matrix  and  vb*  is  an  N-by-N  matrix.  *vb  connotes 
"variance  on  the  bottom";  we  will  later  need  a  variance  on  the  top,  error 
on  the  top,  and  error  on  the  bottom. 

The  bottom  equation  in  (75)  gives 

g3  -  rf  +  vb* y 

so  we  have 

Y  -  (vb’rtga  -  rf)  (76) 

We  now  substitute  (74)  and  (75)  into  (72): 


The  Y“<*ePen^ent  Part  (77)  is 
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The  top  three  equations  of  (79)  ere  just  (73)  egeln.  The  bottom  equation  of 
(79)  determines  rf: 


rf 


t 


rL-k+l 


(80) 


Now  all  we  need  to  calculate  f'  from  J  are  vb'  and  b\ 

Since  the  elements  of  R  now  are  matrices  containing  crosscorrelations, 
we  cannot  rearrange  (75)  in  the  same  way  we  got  (49)  from  (48).  It  turns  out 
that  to  pull  the  same  trick  as  was  used  in  forming  (50)  and  comparing  it  to 
(43),  we  must  define  a  forward  prediction  error  operator  a\  which  satisfies: 


T  T  T 
ro  rl  *2  ra 

T  T 
rl  r0  rl  r2 


1 


1 


r3  r2  *1  ro 


1 

*■  1 

-  * 

4 

Vt' 

ai 

0 

a2 

0 

i  i 

a3 

i  °  J 

m 

(81) 


where  vt'  ^  vb' .  (81)  will  play  the  same  role  equation  (*9)  did  in  the 

single-channel  case.  Thus  two  auxiliary  operators  are  required  for  the  recursion 
procedure,  whereas  the  single-channel  case  required  only  one. 

Suppose,  then,  that  at  this  stage  a’,  b' ,  vt’,  and  vb'  are  unknown,  but 

4  4 

that  we  know  vt  and  vb,  and  a  and  b  which  satisfy. 


- 

m  ■ 

•  “ 

T  T 
r0  rl  r2 

a0 

vt 

T 

rl  r0  rl 

al 

- 

0 

r2  rl  r0 

a2 

0 

_  • 

. 

- 

T  T  ‘ 
r0  rl  rZ 

’  b2" 

0 

T 

rl  r0  ri 

bl 

- 

0 

r2  rl  r0 

bo 

vb 

;(  m 

■  a 

- 

(82) 


(83) 
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We  now  fill  out 
we  have: 


a 


with  a  zero  matt  lx  and  substitute  Into  (81);  using  (82) 


• 

- 

*  ■ 

X  T  T 
r0  rl  r2  r3 

a0 

vt 

T  T 

rl  r0  rl  r2 

al 

0 

m 

T 

r2  rl  r0  rl 

a2 

0 

r3  r2  r  j  r0 

0 

eb 

J 

■* 

(84) 


where  eb  Is  an  unknown  N-by-N  matrix.  Similarly  extending  b,  substituting 
Into  (78),  and  using  (83),  we  have: 


r0 
r  1 
r2 
r3 


- 

- 

•  * 

T  T  T 
rl  r2  r3 

0 

et 

T  T 

r0  *1  *2 

b2 

m 

0 

T 

rl  r0  rl 

bl 

0 

r2  rl  r0 

b0 

vb 

a 

•  a 

_  a 

(85) 


where  et  is  an  unknown  N-by-N  matrix. 

Now  we  postmultiply  (84)  by  an  unknown  N-by-N  matrix  a,  postmultiply  (85) 
by  an  unknown  N-by-N  matrix  B,  an’  add: 


“ 

- 

/ 

m  x 

■ 

• 

- 

ro 

T 

rl 

T 

r2 
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The  top  equation  of  the  8-dependent  part  of  (86)  determines  et: 
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and  the  bottom  equation  of  the  n-dcpcndent  part  of  (86)  determines  eb: 
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kH0 


(88) 


Now  let  us  postmult i p 1 y  (78)  by  6,  postmultlply  (81)  by  a,  and  add: 

/ 


T 

l 


T 

:2 


T 

'3 


T  T 
ri  r0  ri  r2 


r2  ri  ro  rl 


r3  r2  rl  r0 


- 

-  “ 

a0 

b3 

\ 

vt' 

0 

al 

K 

1 

0 

0 

a  + 

8 

■i 

a  + 

w 

0 

0 

a3 

bo 

0 

vb' 

«•  m 

i 

m  m 

»  m 

(89) 


We  now  Identify  (89)  with  (86).  To  do  this,  It  turns  out  to  be  necessary  to 
select 

(vt)a  +  et  -  0 


or 


and 


or 


a  •  -(vt)  *  et 


(vb)8  +  eb  ■  0 


(90) 
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Substituting  (90)  and  (91)  Into  (86),  we  have 
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We  can  pet  rid  of  the  -et  and  the  -eb  simply  by  adding  (84)  and  (85)  to  (92): 
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Now  our  Identification  is  complete  if  we  put: 

vt’  *  vt  +  (et)B 
vb’  «  vb  +  (eb)a 


(94) 

(95) 
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+  bL-k+l6 


k  -  0,L 


(96) 


bk  +  aL-k+l° 


k  -  0,L 


(97) 


A  time-saver: 

The  fact  that  the  matrix  et  is  always  the  transpose  of  the  matrix  eb  was 

proved  by  J.  P.  Burg  in  1962  (personal  communication)  and  much  more  simply  by 

D.  W.  McCowan  (personal  communication)  in  1966  as  follows: 

T  T  T 

Premultiplying  (84)  by  (0,  b2,  bj,  bg).  we  have 
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the  last  equality  following  from  the  fact  that  we  have  bQ  -  I  throughout 

T  T  T 

the  recursion  process.  Similarly  premultiplying  (85)  by  (a0,  ap  a2,  0)  we 
have 


the  last  equality  following  from  the  fact  that  we  have  a„  -  I  throughout  the 
recursion  process.  Since  R  -  RT,  it  follows  that  the  left  aide  of  (98)  is  the 
transpose  of  the  left  side  of  (99),  and  hence  that 

et  -  ebT  (WO) 
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7. 

vt* 

■  vt  +  (et)8 

8. 

vb* 

■  vb  +  (eb)a 

L 

9. 

rf 

■  j0Vk+lfk 

10. 

Y 

-  -  rf) 

11. 

K 

“  fk  +  bL-k+lY 

Sorae  special  cases  of  multichannel  filters 

Multichannel  prediction  filter:  This  predicts  the  input  channels,  making  use 

of  the  crosscorrelation  between  channels.  For  examples,  see  Claerbout  (1964) . 
We  take  M  •  N,  and 

dj(t)  -  yj (t  +  p) 

where  p  is  the  prediction  span.  The  normal  equations  are: 
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Array  sum  prediction: 

This  predicts  the  array  sum.  For  a  single  output, 
1  N 

d(t)  -  £  I  yt(t  +  p) 

"  i-1 
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where  p  is  the  prediction  span.  We  can  predict  the  array  sum  at  several 
different  spans  simultaneously: 


V°  ’  N  yl(t  +  Pj) 


J  -  1*H 


where  p^,  j-l,M,  are  a  set  of  M  different  prediction  spans, 
The  right-hand  side  contains  matrices  g  ,  s  ■  0,L: 


Vlj  ■  N  rik(fl  +  PJ> 


Spatial  interpolation:  We  take  as  input  N-l  of  the  N  input  channels,  and 
as  desired  output  the  remaining  input  channel.  The  normal  equation  matrix 
Is  L* (N-l)-by-L* (N-l) ,  and  the  right-hand  side  contains  matrices  gfl,  s-0,(., 
where 

Vl  ’  rlN(s) 

The  frequency  spectrum  of  the  interpolation  error  output,  formed  by  subtracting 
the  N'th  channel  from  the  interpolation  filter  output,  can  be  used  to  determine 
the  spatial  coherence  of  the  data  (Flinn  and  McCowan,  1967). 

Equalization:  The  "equalization”  filter  simply  attempts  to  convert  each  input 

trace  to  the  array  sum,  and  hence  is  identical  to  the  N-output  array  sum 
predictor  at  zero  prediction  span. 

The  equalization  error  output  is  a  measure  of  the  variability  of  the 
waveform  across  the  array  (for  an  example,  see  Backus,  1966). 

Slgnal-to-nolse  ratio  enhancement:  Suppose  that  the  input  channels  consist 
of  signal  mixed  with  noise,  and  we  wish  the  output  channels  to  contain  only 
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the  signal.  This  situation  can  be  handled  within  the  framework  we  have 
outlined  above. 

The  inputs  are: 

y^(t)  *  n^(t)  +  s^(t)  i  ■  1|N  (101) 

where  n  (t)  represents  the  noise  and  s.(t)  the  signal;  we  assume  that  the  two 
have  different  correlation  properties  across  the  array.  The  desired  outputs 
are: 

d^t)  =  9j(t)  J  -  l.N  (102) 

We  may  ask  for  fewer  output  channels  in  order  to  get  a  more  effective 
estimate  of  the  signal.  For  example,  if  the  signal  is  a  wave  travelling 
across  the  array,  we  might  ask  only  for  the  time-shifted  array  sum: 

d(t)  -  ^  l ^  eA(t  -  c£)  (103) 

where  c^  is  the  time  delay  appropriate  to  the  i’th  channel. 

The  normal  equations  are  set  up  and  solved  as  before.  Frequently  we 
measure  the  noise  correlation  functions  and  use  a  theoretical  model  for  the 
noise.  In  this  case  we  have  an  adjustable  parameter  in  (101),  which  becomes: 

y^(t)  ■*  n^(t)  +  Xs^(t)  i  ■  l.N  (101a) 

where  X  determines  the  signal-to-noise  ratio.  If  the  actual  inputs  have  a 
different  X  than  that  for  which  the  filters  were  designed,  the  performance 
is  degraded. 
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TIh‘  opt  I  mum  linear  filter  with  cons  t  ra  I  ni 


The  previous  filters  were  designed  using  no  criterion  other  th.m  the 
eff  It'  ieuev  of  converting  one  waveform  into  another.  In  some  cases,  however, 

It  is  desli.ihle  to  impose  constraints  on  the  filter. 

As  a  simple  example  of  the  single-channel  filter  with  a  constraint,  we 
consider  the  maximum  output  energy  filter  (Claerbout,  1961).  The  design 
criterion  Is  simply  that  when  the  input  is  a  certain  waveform  x(t),  the 
energy  In  the  output  should  be  as  large  at,  possible.  Obviously  some  sort  of 
constraint  Is  needed  to  keep  the  filter  coefficients  finite.  We  can  choose 
to  require  that  the  energy  in  the  filter  be  unity: 


l  fJ<o  •  i 

t-0 


(104) 


The  output  energy  Is: 


T+L+l 

E  -  l  *?U) 
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Using  a  Lagrange  multiplier,  we  seek  to  maximize 


Q  -  E  +  X 


1  -  l  f2(t> 


t-0 


(106) 


Using  the  vector  and  matrix  notation  set  up  earlier  (equations  2  and  2a), 

we  have:  -*r  +  -vr 

Q  -  Z  Z  +  *  (1  -  f  f) 

-V  T  +  -*T 

-  (Yf)  Yf  +  X  (1  -  f  fl 

-*-r  T  -*  "►T  ■> 

-  pYTYf  +  x  (1  -  fl  fl 
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Uslng  (14), 


-*T  ■>  -*T  . 

Q  .  ?  Rf  +  Ml  -  f  fl 


(107) 


Now 


3Q/3X  =  0  gives  us  the  constraint,  and  3Q/3 f ^  ”  0  gi,Tes: 


12-  -  Rf  -  Xf  -  0 

3fJ 

or  Rf  -  X? 

which  is  a  simple  eigenvalue  equation.  Me  see  that  X  is  the  largest  eigenvalue 
of  the  R  matrix,  and  f  is  the  corresponding  eigenvector.  If  we  had  chosen 
to  minimize  the  output  energy  when  the  input  is  x(t),  we  would  find  the  smallest 
eigenvalue  of  R  and  the  corresponding  eigenvector. 

The  actual  output  energy  for  this  filter  is: 

-kt  -*■  -+t  -*■  ,  ~*T 

E  ■*  z  z  *  f  Rf  ■  If  f  *  X 


so  that  the  eigenvalue  is  the  output  energy. 


-Vf  ■* 


We 


could  have  arrived  at  equation  (108)  by  arguing  that  maximizing  z  z 


subject  to  the  constraint  f^  f  ■  1  is  equivalent  to  maximizing  the  ratio 
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(109) 


In  a  slightly  more  complicated  version  of  this  filter,  we  not  only  ask 
that  the  output  be  as  large  as  possible  (in  the  energy  sense)  when  x(t)  is  the 
input,  but  also  that  the  output  be  as  small  as  possible  when  some  other 
waveform  w(t)  is  input.  This  is  equivalent  to  maximizing  the  ratio 
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wllon.  W  is  a  matrix  formed  from  w(t)  in  the  same  way  that  Y  was  form’d  from 
y(t)--*;pe  equation  (2a) 

Writing  the  autocorrelation  matrix  of  W  as  P  -  W  W,  (110)  becomes 
£T  (R?  +  XP^)  «  0 

♦  -*  '1111 
or  Rf  +  XPf  =  0 

which  is  a  generalized  eigenvalue  problem.  (Ill)  can  always  be  solved,  since 
correlation  matrices  are  non-negative  definite.  If  L  is  not  too  large,  we 
can  covert  (111)  into  (108)  by  premultiplying  by  P  : 


(P_1R)f  +  Xf 


(112) 


The  maximum-likelihood  filter: 

The  maximum-likelihood  or  minimum-variance  filter  (the  two  terms  are 
synonymous  if  che  input  is  a  Gaussian  random  process)  was  designed  by  Levin 
(Kelly  and  Levin,  1965).  It  is  an  example  of  a  multichannel  filter  with  a 

constraint . 

We  assume  a  single  output  channel,  and  ask  that  the  energy  in  it  be  as 
small  as  possible,  i.e.,  we  wish  to  minimize 
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(113) 


We  obviously  need  a  constraint  to  keep  the  filter  coefficients  from  turning 
out  to  be  zero.  Levin  suggested  a  linear  constraint  based  on  prior  knowledge 
of  the  nature  of  the  input.  In  Levin's  case  the  input  channels  consisted  of 
signal  and  noise,  but  the  signal  was  known  to  occur  simultaneously  on  all  the 
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channels  will;  the  same  waveform  and  amplitude: 


y^(t)  =  n1(t)  +  s ( t ) 


i  -  1  »N 


(114) 


Levin’s  constraint  was  that  the  sum  (across  channels)  of  the  filter  coefficients 
should  be  a  single  spike  at  t  =  0: 


l 


1=1 


f^(t)  9  h(t)  =  (1,0,0, ... ,0) 


(115) 


Using  this  constraint,  the  output  is: 


z(t)  =  [  I  u  (t-r)  f£(T) 

i=l  T  =0 


N  L  N  L 

T  i  n  (t  -  t)  f . (t)  +  l  l  s(t-T)  £  (t: 

ill  rlo  1  1  i-1  t-0 


or 


z(t)  *  l  l  n  (t  -  t)  ft(x)  +  N  s(t) 
i-1  t  =0 


(116) 


Thus  the  effect  of  minimizing  the  power  in  the  output  by  varying  the  filter 
coefficients,  under  the  constraint  (115),  is  to  reduce  the  first  term  on 
the  right:  side  of  (116)  relative  to  the  second  term  (since  the  second  term 
does  not  depend  on  the  filter  coefficients  at  all)--i.e.,  the  effect  is  to 
increase  the  signal-to-noise  ratio.  Kelly  and  Levin  showed  that  the  output 
signal  is  unbiased,  i.e.,  that  the  signal  comes  through  undistorted.  The 
filter  can,  of  course,  be  designed  so  that  the  output  signal  arrives  with 
anv  desired  delay,  i.e.,  the  right  side  of  (115)  might  be 


-1*2 


h(t)  *  0,0. ...0,1,0,... 


Ac  tun  1 l y ,  nny  waveform  h(t)  cou’d  be  used  as  a  constraint,  e.g.,  the  Impulse 
response  of  some  desirable  bandpass  prof  I  Iter  could  be  used  as  h(t),  and  this 
would  save  doing  the  pref llterlng.  It  will  be  apparent  later  that  there  are 
compensating  advantages  In  using  an  h(t)  which  has  a  single  nonzero  element. 

The  normal  equations  for  the  maximum-likelihood  filter  are  conaiderably 
•tore  complicated  than  in  the  case  of  the  wave-shaping  filter.  Using  Lagrange 
Multipliers,  the  quantity  we  minimize  is: 
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(117) 


where  we  have  split  off  a  factor  or*  (-2)  from  the  Lagrange  multipliers  for 
later  convenience. 

Writing  out  3Q/3fk(u)  -  0,  k-l,N;  u-0,L,  gives: 


T+L+l  N  L 

III  y^t-s)  Ms)  y.(t-u)  -  xu 
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(118) 


3Q/3>  -  0,  u-0,L,  gives: 
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Using  (60)  ;md  (62),  wc  can  write  (118)  as: 


N  1, 

l  l  rkJ(s-u)  fl^s^  =  Xu  u  "  °»L  (120) 

i*=l  s^O 


Wc  define  R  and  F  as  in  (69)  and  (70),  as  well  as: 
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and  a  summing  matrix 
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(121) 


(122) 


an  L-by-N-L  matrix,  each  row  of  which  contains  N  adjacent  ones  distributed 
as  shewn.  S  can  be  thought  of  as  a  stretched  unit  matrix.  Using  (122),  we 
can  write  (119)  as: 


STF  -  H 

and  we  can  write  (120)  as: 


(123) 


RF  -  SA 


(124) 
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Wc  need  to  solve  (123)  and  (124)  for  F.  Observe  first  that 
STS  =  NI 

where  1  Is  the  Identity  matrix.  Starting  with  (123),  we  have: 

STF  =  H 
STR_1RF  *=  H 

STR_1SSTRF  -  NH 

STRF  =  N(STR'1S)'1H 

strf  -  sts(str"1s)'1h 


(125) 


which  Is  satisfied  If 

-1  T  -1  -1 

F  -  R  lS(SR  S)  AH 

This  can  be  abbreviated 

F  -  P(STP)_1H 
where  „  _  D-lc 


(126) 


(127) 

(128) 


Notice  that  P  is  a  sort  of  optimum  multichannel  filter  in  its  own  right, 
since  it  is  a  solution  of  the  normal  equations  Rn  -  S.  Because  of  this  we 

can  compute  P  using  the  multichannel  recursion,  and  form  F  by  selecting 
the  j'th  column  of  P(STP)-1,  where  the  single  nonzero  element  of  H  is  the 

j'th  element. 

Using  (126),  the  output  energy  is: 


When  H 
Is  thus 


E 


> 

z  z 


(yf)tyf  -  nht(str-1s)-1h 


(129) 


contains  only  one  nonzero  element  (say  the  J'th),  the  output  error 

T  -1  -1 

N  times  the  J'th  diagonal  element  of  (S  R  S)  ,  so  we  can  compute 


the  filter  performance  without  actually  calculating  the  output. 
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